Sensitivity of Nonlinear Network Training to Affine Transformed Inputs
Authors
Abstract
In this paper, the effects of nonsingular affine transforms on various nonlinear network training algorithms are analyzed. It is shown that gradient related methods are quite sensitive to an input affine transform, while Newton related methods are invariant. These results give a connection between pre-processing techniques and weight initialization methods. They also explain the advantages of Newton related methods over other algorithms. Numerical results validate the theoretical analyses.
Introduction
Nonlinear networks, such as multi-layer perceptron (MLP) and radial basis function (RBF) networks, are popular in the signal processing and pattern recognition areas. The networks' parameters are adjusted to best approximate the underlying relationship between input and output training data (Vapnik 1995). Neural net training algorithms are time consuming, in part due to ill-conditioning problems (Saarinen, Bramley, & Cybenko 1993). Pre-processing techniques, such as input feature decorrelation and the whitening transform (LeCun et al. 1998; Brause & Rippl 1998), are suggested to alleviate ill-conditioning and thus accelerate the network training procedure (Š. Raudys 2001). Other input transforms, including input re-scaling (Rigler, Irvine, & Vogl 1991), unbiasing and normalization, are also widely used to equalize the influence of the input features. In addition, some researchers think of the MLP as a nonlinear adaptive filter (Haykin 1996). Linear pre-processing techniques, such as noise cancellation, can improve the performance of this nonlinear filter. Although these linear transforms are widely used, their effects on MLP training algorithms have not been analyzed in detail.
Interesting issues, such as (1) whether these pre-processing techniques have the same effects on different training algorithms, and (2) whether the benefits of this pre-processing can be duplicated or cancelled out by other strategies, i.e., advanced weight initialization methods, still need further research. In a previous paper (Yu, Manry, & Li 2004), the effects of an input orthogonal transform on the conjugate gradient algorithm were analyzed using the concept of equivalent states. We showed that the effect of an input orthogonal transform can be absorbed by a proper weight initialization strategy. Because all linear transforms can be expressed as an affine transform, in this paper we analyze the influence of the more general affine transform on typical training algorithms. First, the conventional affine transform is described. Second, typical training algorithms are briefly reviewed. Then, the sensitivity of various training algorithms to input affine transforms is analyzed. Numerical simulations are presented to verify the theoretical results.
∗This work was supported by the Advanced Technology Program of the state of Texas, under grant number 003656-0129-2001.
Copyright © 2005, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
General Affine Transform
Suggested pre-processing techniques, such as feature decorrelation, whitening, input unbiasing and normalization, can all be put into the form of a nonsingular affine transform:
$$\mathbf{z}_p = \mathbf{A}\mathbf{x}_p^T + \mathbf{b} \tag{1}$$
where $\mathbf{x}_p$ is the original $p$th input feature vector, $\mathbf{z}_p$ is the affine transformed input, and $\mathbf{b} = [\, b_1 \; b_2 \; \cdots \; b_N \,]$. For example, unbiasing the input features is modelled as
$$\mathbf{z}_p = \mathbf{x}_p^T - \mathbf{m}_x \tag{2}$$
An affine transform of particular interest in this paper can be expressed as a linear transform by using extended input vectors:
$$\mathbf{Z}_p = \mathbf{A}_e \mathbf{X}_p^T \tag{3}$$
where $\mathbf{Z}_p = [z_{p1} \cdots z_{pN}, 1]$, $\mathbf{X}_p = [x_{p1} \cdots x_{pN}, 1]$ and
$$\mathbf{A}_e = \begin{bmatrix} a_{11} & \cdots & a_{1N} & b_1 \\ \vdots & \ddots & \vdots & \vdots \\ a_{N1} & \cdots & a_{NN} & b_N \\ 0 & \cdots & 0 & 1 \end{bmatrix} \tag{4}$$
Conventional Training Algorithms
Generally, training algorithms for nonlinear networks can be classified into three categories: gradient descent methods, conjugate gradient methods, and Newton related methods (Haykin 1999; Møller 1997). In the following, we give a brief review of each method.
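The extended-vector form of Eqs. (3) and (4), and the abstract's claim that gradient related methods are sensitive to an input affine transform while Newton related methods are not, can be checked numerically on a toy problem. The NumPy sketch below is only an illustration under stated assumptions, not the paper's experiment: it uses a linear least-squares model in place of a nonlinear network, and the data, the matrix `A`, the vector `b`, the learning rate, and all function names are hypothetical. The weights `U = W @ inv(Ae)` are chosen so that both models start from the same outputs.

```python
import numpy as np

def extend(x):
    """Append a constant 1 to each row: (Nv, N) -> (Nv, N+1)."""
    return np.hstack([x, np.ones((x.shape[0], 1))])

def extended_matrix(A, b):
    """Assemble A_e of Eq. (4) from A (N, N) and b (N,)."""
    N = A.shape[0]
    Ae = np.eye(N + 1)
    Ae[:N, :N] = A
    Ae[:N, N] = b
    return Ae

def gradient(W, X, T):
    """Gradient of E = mean_p ||t_p - W X_p^T||^2 with respect to W."""
    return -2.0 * (T - X @ W.T).T @ X / X.shape[0]

def hessian(X):
    """Hessian of E with respect to any single row of W."""
    return 2.0 * X.T @ X / X.shape[0]

def gradient_step(W, X, T, lr=0.01):
    """One steepest-descent update (lr is an arbitrary illustrative value)."""
    return W - lr * gradient(W, X, T)

def newton_step(W, X, T):
    """One Newton update; solves H d = g rather than inverting H."""
    return W - np.linalg.solve(hessian(X), gradient(W, X, T).T).T

# Hypothetical data: Nv patterns, N inputs with unequal scales and means, M outputs.
rng = np.random.default_rng(0)
Nv, N, M = 200, 3, 2
x = rng.normal(size=(Nv, N)) * np.array([1.0, 5.0, 0.2]) + np.array([2.0, -1.0, 0.5])
T = rng.normal(size=(Nv, M))
W = rng.normal(size=(M, N + 1))

# A nonsingular (upper-triangular, nonzero diagonal), non-orthogonal transform.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.5],
              [0.0, 0.0, 3.0]])
b = np.array([1.0, -2.0, 0.5])
Ae = extended_matrix(A, b)

X = extend(x)                 # extended original inputs, one row per pattern
Z = extend(x @ A.T + b)       # Eq. (1) applied to every pattern, then extended
U = W @ np.linalg.inv(Ae)     # weights giving the same outputs on Z as W gives on X

# Network outputs after one update in each input space.
grad_x = X @ gradient_step(W, X, T).T
grad_z = Z @ gradient_step(U, Z, T).T
newton_x = X @ newton_step(W, X, T).T
newton_z = Z @ newton_step(U, Z, T).T

print("Eq. (3) holds:", np.allclose(Z, X @ Ae.T))
print("same starting outputs:", np.allclose(X @ W.T, Z @ U.T))
print("max output gap after gradient step:", np.abs(grad_x - grad_z).max())
print("max output gap after Newton step:", np.abs(newton_x - newton_z).max())
```

In this sketch the two models agree before training and after the Newton step up to rounding error, while the gradient step produces different outputs, because the update seen from the original input space is multiplied by $\mathbf{A}_e^T\mathbf{A}_e$, which is the identity only for an orthogonal $\mathbf{A}_e$.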
Similar resources
Passivity-Based Stability Analysis and Robust Practical Stabilization of Nonlinear Affine Systems with Non-vanishing Perturbations
This paper presents some analyses about the robust practical stability of a class of nonlinear affine systems in the presence of non-vanishing perturbations based on the passivity concept. The given analyses confirm the robust passivity property of the perturbed nonlinear systems in a certain region. Moreover, robust control laws are designed to guarantee the practical stability of the perturbe...
Iterative learning identification and control for dynamic systems described by NARMAX model
A new iterative learning controller is proposed for a general unknown discrete time-varying nonlinear non-affine system represented by NARMAX (Nonlinear Autoregressive Moving Average with eXogenous inputs) model. The proposed controller is composed of an iterative learning neural identifier and an iterative learning controller. Iterative learning control and iterative learning identification ar...
PREDICTION OF NONLINEAR TIME HISTORY DEFLECTION OF SCALLOP DOMES BY NEURAL NETWORKS
This study deals with predicting the nonlinear time history deflection of scallop domes subject to earthquake loading by employing a neural network technique. Scallop domes have alternate ridges and grooves that radiate from the centre. There are two main types of scallop domes, lattice and continuous, of which the latticed type is considered in the present paper. Due to the large number o...
Neural Network Sensitivity to Inputs and Weights and its Application to Functional Identification of Robotics Manipulators
Neural networks are applied to system identification problems using adaptive algorithms for either parameter or functional estimation of dynamic systems. In this paper, the neural networks' sensitivity to input values and connection weights is studied. The Reduction-Sigmoid-Amplification (RSA) neurons are introduced and four different models of neural network architecture are proposed and...
AN OBSERVER-BASED INTELLIGENT DECENTRALIZED VARIABLE STRUCTURE CONTROLLER FOR NONLINEAR NON-CANONICAL NON-AFFINE LARGE SCALE SYSTEMS
In this paper, an observer based fuzzy adaptive controller (FAC) is designed for a class of large scale systems with non-canonical non-affine nonlinear subsystems. It is assumed that functions of the subsystems and the interactions among subsystems are unknown. By constructing a new class of state observer for each follower, the proposed consensus control method solves the problem of unmeasured sta...